Learning to combine the modalities of language and video for temporal moment localization

Abstract

Temporal moment localization (TML) aims to retrieve the video segment that best matches a moment specified by a query. Existing methods generate visual and semantic embeddings independently and fuse them without fully considering the long-term temporal relationship between them. To address these shortcomings, we introduce a novel recurrent unit, the cross-modal long short-term memory (CM-LSTM), which mimics the human cognitive process of localizing moments: it focuses on the part of the video related to the query and accumulates contextual information across the entire video recurrently. In addition, we devise a two-stream attention mechanism that uses both the features attended and the features unattended by the input query, to prevent necessary information from being neglected. To obtain more precise boundaries, we propose a two-stream attentive cross-modal interaction network (TACI) that generates two 2D proposal maps, one obtained globally from the integrated features generated using CM-LSTM and one obtained locally from boundary score sequences, and then combines them into a final map in an end-to-end manner. On the TML benchmark dataset ActivityNet-Captions, TACI outperforms state-of-the-art methods, with scores of 45.50% and 27.23% on the two reported metrics, respectively. We also show that revised state-of-the-art methods, in which the original LSTM is replaced with our CM-LSTM, achieve performance gains.
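
As a rough illustration of how two 2D proposal maps might be combined, the sketch below builds a local map from start and end boundary score sequences and fuses it with a global map. The outer-product construction, the element-wise product fusion, and all function names are assumptions made for illustration; the abstract does not specify these details, and this is not the TACI implementation.

```python
# Minimal sketch of 2D proposal maps for temporal moment localization.
# Cell (i, j) scores the candidate segment that starts at clip i and ends at clip j.
# All names and the fusion rule are hypothetical, not taken from the paper.

def local_proposal_map(start_scores, end_scores):
    """Build a local 2D map from boundary scores: start[i] * end[j] for j >= i."""
    n = len(start_scores)
    return [
        [start_scores[i] * end_scores[j] if j >= i else 0.0 for j in range(n)]
        for i in range(n)
    ]

def combine_maps(global_map, local_map):
    """Fuse two maps cell by cell (assumed here to be an element-wise product)."""
    return [
        [g * l for g, l in zip(g_row, l_row)]
        for g_row, l_row in zip(global_map, local_map)
    ]

def best_segment(score_map):
    """Return the (start, end) clip indices of the highest-scoring valid cell."""
    n = len(score_map)
    _, start, end = max(
        (score_map[i][j], i, j) for i in range(n) for j in range(i, n)
    )
    return start, end

# Toy example with four clips: the boundary scores peak at start=1 and end=2.
start_scores = [0.1, 0.9, 0.2, 0.1]
end_scores = [0.1, 0.2, 0.8, 0.3]
local_map = local_proposal_map(start_scores, end_scores)

# A uniform global map stands in for scores derived from contextual features.
global_map = [[0.5 if j >= i else 0.0 for j in range(4)] for i in range(4)]

final_map = combine_maps(global_map, local_map)
predicted = best_segment(final_map)
```

Only the upper triangle of each map is meaningful, since a segment cannot end before it starts. A learned model would typically replace both the uniform global map and the fixed fusion rule.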

Journal

Journal title: Computer Vision and Image Understanding

Year: 2022

ISSN: 1090-235X, 1077-3142

DOI: https://doi.org/10.1016/j.cviu.2022.103375